ABSTRACT



An output translator provides for cross module representations of components within a heterogeneous program by translating a platform-neutral intermediate representation (IR) of the program into platform-specific instructions for different architectures. The intermediate representation is a hierarchy of base elements that correspond to instructions, code blocks, procedures and components within the program. Blocks of instructions that were originally written for one architecture can be translated from the intermediate representation into platform-specific instructions for a different architecture. The output translator provides any necessary code to interface contiguous code blocks that are emitted in different instruction sets.

24 Claims, 11 Drawing Sheets 
CROSS MODULE REPRESENTATION OF
HETEROGENEOUS PROGRAMS 

RELATED APPLICATIONS 

The present application is related to U.S. Patent applica- 
tions entitled "Translation And Transformation of Hetero- 
geneous Programs" (U.S. patent application Ser. No. 09/343, 
805), "Instrumentation and Optimization Tools for 
Heterogeneous Programs" (U.S. patent application Ser. No. 
09/343,298), "Application Program Interface for Transform- 
ing Heterogeneous Programs" (U.S. patent application Ser. 
No. 09/343,276), and "Shared Library Optimization for 
Heterogeneous Programs" (U.S. patent application Ser. No. 
09/343,279), filed on the same day as the present application 
and assigned to the same assignee. 

FIELD OF THE INVENTION 

This invention relates generally to programming tools, 
and more particularly to translating code between computer 
architectures. 

COPYRIGHT NOTICE/PERMISSION 

A portion of the disclosure of this patent document 
contains material which is subject to copyright protection. 
The copyright owner has no objection to the facsimile 
reproduction by anyone of the patent document or the patent 
disclosure as it appears in the Patent and Trademark Office 
patent file or records, but otherwise reserves all copyright 
rights whatsoever. The following notice applies to the soft- 
ware and data as described below and in the drawings 
hereto: Copyright© 1998, Microsoft Corporation, All Rights 
Reserved. 

BACKGROUND OF THE INVENTION 

In a new programming paradigm, a program is now a 
collection of components. Each component publishes an 
interface without exposing its inner details. Thus, a compo- 
nent can internally exist in any form: Intel x86 binary, Intel 
IA-64 binary, Visual Basic (VB) byte codes, Java class files, 
or any Virtual Machine (VM) binary. A heterogeneous 
program consists of components in different forms. Hetero- 
geneous programs already exist in some environments: in 
the Microsoft Windows 32-bit environment, a Visual Basic 
program is compiled into VB byte codes that can call 
native-compiled functions in a separate dynamic linked 
library. Similarly Java class files can call native functions. 
Intel's IA-64 architecture allows IA-64 code to co-exist with 
x86 code. 

To understand the behavior of a heterogeneous program, all its components, regardless of their form, have to be instrumented and analyzed in the same framework; otherwise, only partial information will be collected. It is important to note that systems that have been ported to several architectures are not sufficient to handle heterogeneous programs. For example, a system for VB byte codes that has been ported to x86 cannot provide a complete execution time analysis of a heterogeneous program consisting of VB byte codes and native x86 because each system operates in isolation on its own input.

Further, a heterogeneous program may consist of hetero- 
geneous components. A heterogeneous component is a 
single component consisting of routines in different instruc- 
tion sets. As the interface is well defined, components 
internally can use any instruction set. Each instruction set 
has its own advantages such as execution time, portability, 
and size. 

All previous systems have been designed for homogeneous programs: conventional programs consisting of components in the same form. Some systems have been targeted to different architectures, but cannot work with heterogeneous programs. None of these systems can generate a heterogeneous component.

A large number of systems have been developed to help analyze and optimize homogeneous programs. The creation of "Pixie" by MIPS Computer Systems, Inc. in 1986 started a class of basic block counting tools by inserting a predetermined sequence of instructions to record execution frequencies of basic blocks. "Epoxie" extended the technique by using relocations to eliminate dynamic translation overheads. David W. Wall, Systems for late code modification, in Code Generation - Concepts, Tools, Techniques, pp. 275-293 (Robert Giegerich and Susan L. Graham, eds., 1992). "QPT" further extended the technique by constructing spanning trees to reduce the number of basic blocks that are instrumented. James Larus and Thomas Ball, Rewriting executable files to measure program behavior, Software, Practice and Experience, vol. 24, no. 2, pp. 197-218 (1994). "Purify" instruments memory references to detect out-of-bounds memory accesses and memory leaks. Reed Hastings and Bob Joyce, Purify: Fast Detection of Memory Leaks and Access Errors, Proceedings of Winter Usenix Conference, January 1992.

"OM" allowed general transformations to be applied to a binary by converting the binary to an intermediate representation that can be easily manipulated. Amitabh Srivastava and David Wall, A Practical System for Intermodule Code Optimization at Link Time, Journal of Programming Languages, 1(1): 1-18 (1993). OM has been implemented on MIPS, DEC Alpha and Intel x86 architectures. "EEL" uses a similar technique and provides an editing library for Sun SPARC architectures. James R. Larus and Eric Schnarr, EEL: Machine-Independent Executable Editing, Proceedings of SIGPLAN '95 Conference on Programming Language Design and Implementation (1995). "Alto" and "Spike" are optimizers for the DEC Alpha architectures. K. De Bosschere and S. Debray, Alto: a Link-Time Optimizer for the DEC Alpha, Technical Report TR-96-16, Computer Science Department, University of Arizona (1996). David W. Goodwin, Interprocedural Dataflow Analysis in an Executable Optimizer, Proceedings of SIGPLAN '97 Conference on Programming Language Design and Implementation (1997).

"ATOM" extended OM by providing a flexible instrumentation interface for the DEC Alpha and Intel x86 systems. Amitabh Srivastava and Alan Eustace, ATOM: A System for Building Customized Program Analysis Tools, Proceedings of SIGPLAN '94 Conference on Programming Language Design and Implementation (1994). However, ATOM does not allow modifications to a binary. "Etch" provided a similar system for x86 and "BIT" for Java byte codes. T. Romer, G. Voelker, D. Lee, A. Wolman, W. Wong, H. Levy, B. Chen, and B. Bershad, Instrumentation and Optimization of Win32/Intel Executables Using Etch, Proceedings of the USENIX Windows NT Workshop (1997). Han Lee and Benjamin Zorn, BIT: A Tool for Instrumenting Java Bytecodes, Proceedings of the 1997 USENIX Symposium on Internet Technologies and Systems (1997).

None of these systems work on heterogeneous programs. Some of them have been ported to multiple architectures, but they provide only a partial view when applied to heterogeneous programs as each implementation operates on its input in isolation. Although OM builds a symbolic representation, the representation was primarily designed for applying arbitrary transformations and is not sufficient to handle heterogeneous programs. None of these systems can generate heterogeneous components. ATOM provides a flexible interface for instrumentation only.

Because a heterogeneous program provides efficiencies that cannot be achieved by a homogeneous program, a mechanism is needed that can convert portions of a homogeneous program into a different instruction set to optimize the execution of the program or produce more compact code. Furthermore, the ability to apply the same mechanism to an existing heterogeneous program to produce further optimization is also desirable.

SUMMARY OF THE INVENTION

The above-mentioned shortcomings, disadvantages and problems are addressed by the present invention, which will be understood by reading and studying the following specification.

An output translator provides for cross module representations of components within a heterogeneous program by enabling a code block in a component to be translated from a platform-neutral intermediate representation of the program into a set of platform-specific instructions that are directed to a different architecture than that for which the code block was originally written. The output translator provides any necessary prologue and/or epilogue code to interface contiguous code blocks that are emitted in different instruction sets. For an architecture that has both short and long forms for instructions, one aspect of the output translator initially assumes the emitted instruction will be in its short form and only substitutes the long form when required. The output translator also enables the substitution of one code block for another, automatically adjusting entry points as required. Changes in the order of code blocks within the component are accommodated while preserving, and optimally optimizing, the original control flow.

Because the architecture of a code block can be changed when it is translated from the intermediate representation, a user can create a more efficient heterogeneous program from a homogeneous program or can optimize an existing heterogeneous program by specifying an architecture that supplies a desired characteristic, such as speed or compactness of code. Even without changing platform, the output translator can produce more compact code than originally generated by a compiler because the output translator uses short forms for instructions as a default size where the majority of compilers default to long forms. Additionally, the output translator can emit instructions for a new platform for which a compiler has not yet been written, allowing early testing and evaluation of the architecture.

The present invention describes systems, clients, servers, methods, and computer-readable media of varying scope. In addition to the aspects and advantages of the present invention described in this summary, further aspects and advantages of the invention will become apparent by reference to the drawings and by reading the detailed description that follows.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram of the hardware and operating environment in conjunction with which embodiments of the invention may be practiced;

FIG. 2A is a diagram illustrating a system-level overview of an exemplary embodiment of the invention;

FIGS. 2B, 2C and 2D are diagrams illustrating additional details of the processes shown in FIG. 2A;

FIG. 3 is a diagram of an intermediate representation hierarchy used by the exemplary embodiment of FIG. 2A;

FIG. 4A is a flowchart of an output translator method to be performed by a computer according to an exemplary embodiment of the invention;

FIGS. 4B and 4C are flowcharts of details of the exemplary embodiment of the output translator method of FIG. 4A;

FIG. 5 is a diagram of a redirected block created by the output translator method of FIG. 4A;

FIGS. 6A and 6B are diagrams of flow changes created by the output translator method of FIG. 4A; and

FIG. 7 is a diagram of an emitted block data structure created by the output translator method of FIG. 4A.

DETAILED DESCRIPTION OF THE INVENTION

In the following detailed description of exemplary embodiments of the invention, reference is made to the accompanying drawings which form a part hereof, and in which is shown by way of illustration specific exemplary embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention, and it is to be understood that other embodiments may be utilized and that logical, mechanical, electrical and other changes may be made without departing from the spirit or scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined only by the appended claims.

The detailed description is divided into four sections. In the first section, the hardware and the operating environment in conjunction with which embodiments of the invention may be practiced are described. In the second section, a system level overview of the invention is presented. In the third section, methods and data structures for an exemplary embodiment of the invention are provided. Finally, in the fourth section, a conclusion of the detailed description is provided.

Hardware and Operating Environment

FIG. 1 is a diagram of the hardware and operating environment in conjunction with which embodiments of the invention may be practiced. The description of FIG. 1 is intended to provide a brief, general description of suitable computer hardware and a suitable computing environment in conjunction with which the invention may be implemented. Although not required, the invention is described in the general context of computer-executable instructions, such as program modules, being executed by a computer, such as a personal computer. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types.

Moreover, those skilled in the art will appreciate that the invention may be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, and the like. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.

The exemplary hardware and operating environment of FIG. 1 for implementing the invention includes a general purpose computing device in the form of a computer 20, including a processing unit 21, a system memory 22, and a system bus 23 that operatively couples various system components, including the system memory 22, to the processing unit 21. There may be only one or there may be more than one processing unit 21, such that the processor of computer 20 comprises a single central-processing unit (CPU), or a plurality of processing units, commonly referred to as a parallel processing environment. The computer 20 may be a conventional computer, a distributed computer, or any other type of computer; the invention is not so limited.

The system bus 23 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. The system memory may also be referred to as simply the memory, and includes read only memory (ROM) 24 and random access memory (RAM) 25. A basic input/output system (BIOS) 26, containing the basic routines that help to transfer information between elements within the computer 20, such as during start-up, is stored in ROM 24. The computer 20 further includes a hard disk drive 27 for reading from and writing to a hard disk, not shown, a magnetic disk drive 28 for reading from or writing to a removable magnetic disk 29, and an optical disk drive 30 for reading from or writing to a removable optical disk 31 such as a CD ROM or other optical media.

The hard disk drive 27, magnetic disk drive 28, and optical disk drive 30 are connected to the system bus 23 by a hard disk drive interface 32, a magnetic disk drive interface 33, and an optical disk drive interface 34, respectively. The drives and their associated computer-readable media provide nonvolatile storage of computer-readable instructions, data structures, program modules and other data for the computer 20. It should be appreciated by those skilled in the art that any type of computer-readable media which can store data that is accessible by a computer, such as magnetic cassettes, flash memory cards, digital video disks, Bernoulli cartridges, random access memories (RAMs), read only memories (ROMs), and the like, may be used in the exemplary operating environment.

A number of program modules may be stored on the hard disk, magnetic disk 29, optical disk 31, ROM 24, or RAM 25, including an operating system 35, one or more application programs 36, other program modules 37, and program data 38. A user may enter commands and information into the personal computer 20 through input devices such as a keyboard 40 and pointing device 42. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 21 through a serial port interface 46 that is coupled to the system bus, but may be connected by other interfaces, such as a parallel port, game port, or a universal serial bus (USB). A monitor 47 or other type of display device is also connected to the system bus 23 via an interface, such as a video adapter 48. In addition to the monitor, computers typically include other peripheral output devices (not shown), such as speakers and printers.

The computer 20 may operate in a networked environment using logical connections to one or more remote computers, such as remote computer 49. These logical connections are achieved by a communication device coupled to or a part of the computer 20; the invention is not limited to a particular type of communications device. The remote computer 49 may be another computer, a server, a router, a network PC, a client, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 20, although only a memory storage device 50 has been illustrated in FIG. 1. The logical connections depicted in FIG. 1 include a local-area network (LAN) 51 and a wide-area network (WAN) 52. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet.

When used in a LAN-networking environment, the computer 20 is connected to the local network 51 through a network interface or adapter 53, which is one type of communications device. When used in a WAN-networking environment, the computer 20 typically includes a modem 54, a type of communications device, or any other type of communications device for establishing communications over the wide area network 52, such as the Internet. The modem 54, which may be internal or external, is connected to the system bus 23 via the serial port interface 46. In a networked environment, program modules depicted relative to the personal computer 20, or portions thereof, may be stored in the remote memory storage device. It is appreciated that the network connections shown are exemplary and other means of and communications devices for establishing a communications link between the computers may be used.

The hardware and operating environment in conjunction with which embodiments of the invention may be practiced has been described. The computer in conjunction with which embodiments of the invention may be practiced may be a conventional computer, a distributed computer, or any other type of computer; the invention is not so limited. Such a computer typically includes one or more processing units as its processor, and a computer-readable medium such as a memory. The computer may also include a communications device such as a network adapter or a modem, so that it is able to communicatively couple to other computers.

System Level Overview

A system level overview of the operation of an exemplary embodiment of the invention is described by reference to FIGS. 2A-D. A heterogeneous program contains multiple executable components, such as main program code and shared libraries, written for different computer architectures (platforms) or programming languages. FIG. 2A shows a system 200 that translates and transforms components in a heterogeneous program. The system 200 comprises an input translator (reader) 210, a transformation module 230, and an output translator (writer) 240. All three modules work with a high-level abstraction of a heterogeneous program, referred to as an "intermediate representation" (IR) 220. The IR is a set of pseudo-instructions for a stack-based logical machine with an unlimited number of registers that represent the functionality of the heterogeneous program.

The reader 210 creates an IR 220 from an executable component (EXE) 201. The reader 210 is a two-stage process as shown in FIG. 2B. First, the executable 201 is parsed 211 into its basic blocks of code and data using information provided in a program database file (PDB) 202. As is well known in the art, a basic code block is defined as a code block having a single entry point and a single exit point. In an alternate embodiment, all the work performed by the parser 211 is input directly into the second stage of the reader 210, thus skipping the parsing process.

Once the code and data blocks are identified, an IR creation process 212 evaluates each platform-dependent instruction on a block-by-block basis. There is a very large
set of common instructions regardless of architecture, e.g., move, store and add, that can be represented by a single platform-neutral IR instruction. For RISC (reduced instruction set computer) architectures, most, if not all, instructions can be easily translated into a single platform-neutral IR instruction. On the other hand, CISC (complex instruction set computer) architectures, such as the Intel x86 family, contain complex instructions that provide the function of multiple instructions. In one exemplary embodiment, the platform-dependent instructions that have a single platform-neutral IR instruction counterpart are translated into that platform-neutral instruction, while complex instructions are replicated as-is within the IR through an extended version of the basic IR instruction. A replicated complex instruction is marked with a signature that denotes its architecture. The output translator 240 recognizes a signed complex instruction and processes it as described further below. In an alternate embodiment, a complex instruction is represented by a set of platform-neutral IR instructions that perform the equivalent function.

After the instructions in the code blocks have been translated, the IR creation process 212 creates a logical hierarchical view of the executable 201 as illustrated in FIG. 3. All architectures share the basic concepts of instructions 305, code blocks 304, data blocks 306, components 302, and procedures 303, so the IR hierarchy 300 enables the user to understand the structure of the intermediate representation of a heterogeneous program 301. The code blocks are logically connected as specified in the EXE file 201 so that the blocks can be more easily manipulated during the transformation process 230. Procedures are determined by following the logical connections using information provided in the PDB file 202. Procedures are collected together to create the program components. Little or no optimization of the program is performed by the creation process 212 since it is desirable that the intermediate representation be as close to what the programmer originally wrote as possible.

However, tracing the logical connections to determine the procedures can result in more procedures being created than originally coded by the programmer as described in the related "Translation and Transformation" patent application. Therefore, the creation process 212 annotates, or "decorates," the hierarchy 300 with the user names supplied in the symbol table for the EXE 201. The annotations enable the user to understand how the IR control flows and how the elements of the IR hierarchy correspond to the procedures and the components in the original code so the appropriate transformations can be applied to the IR. The annotations are maintained in data structures for the procedures during the transformation process and output by the output translator 240.

At the end of the creation of the IR hierarchy, all instructions are represented in the hierarchy as IR instructions within code blocks so that there is no differentiation between code written for one platform and code written for a second platform. The creation of the IR and an exemplary embodiment of the IR hierarchy are described in detail in the related "Translation and Transformation" patent application.

Once the intermediate representation is complete, the user is allowed to manipulate the code and data (illustrated by the IR transformation module 230) through an application program interface (API) 250. The exemplary embodiment of the system 200 provides some pre-defined tools 231 (FIG. 2C) used to instrument and optimize the IR that are guaranteed to be safe in that the tools will evaluate a change requested by the user and only manipulate the code in an appropriate manner. The API 250 also permits the user direct access 232 to the IR to navigate through the IR and to make changes, such as moving blocks between procedures, modifying blocks, rearranging the logical connections between blocks, and changing the platform-specific instruction set for a code block. The tools 231 are described in detail in the related "Instrumentation and Optimization Tool" patent application. The API 250 is described in detail in the related "Application Program Interface" patent application.

By instrumenting the IR using the tools 231, the user can now watch the interrelationship between the various components of a heterogeneous program and determine if a block of code contained in one component is heavily used by another component and therefore that block of code should be moved out of the first component and placed into the second component to speed up execution. This process is described in detail in the related "Shared Library Optimization" patent application. Alternatively, the user may decide to copy, instead of move, the code into the second component, a process referred to in the art as "code replication." A common optimization technique called "inlining" utilizes code replication.

The transformed IR is now input into the output translator 240. The output translator 240 operates on the IR in two phases as shown in FIG. 2D: a linker phase 241 that resolves the logical connections into absolute addresses in an address space for a modified version of the executable, and a writer phase 242 that assembles the IR into the modified version of the executable (EXE') 203. The blocks in the executable 203 can be emitted by the writer 242 for their original platform, or can be emitted for a different platform.

The linker 241 must maintain the semantics of the code of the hierarchy when resolving the addresses, i.e., preserve the logical connections between blocks and the location of referenced data. The linker 241 determines the size of each code block based on the length of each instruction in the block. The linker 241 is also responsible for adding whatever prologue and epilogue code is necessary to "glue" together contiguous blocks that will be assembled into different platform-dependent instructions. As part of the address resolution, the linker 241 also can perform limited code modification or optimization. For example, assume that prior to the transformation process 230, there was a jump between two code blocks, but those blocks are now contiguous. In this case, the linker 241 removes the now-unnecessary jump and lets the logic flow fall through to the second block. Because the hierarchy extends down to the instruction level and is consistent regardless of the manipulation performed by the user, the linker 241 has more knowledge of the placement of instructions than did the programmer. Thus, in architectures in which instructions have both a long and short form depending on the location they are addressing, the linker 241 chooses the appropriate instruction size, which can be a better choice than that originally made by the programmer.

The writer 242 assembles each IR instruction into its platform-dependent counterpart based on the architecture specified in the code block. In an exemplary embodiment in which complex instructions are replicated in the IR, if the complex instruction is being written to the same platform, the writer 242 merely emits the instruction. If the complex instruction is designated to be translated into a different architecture, the writer 242 creates the appropriate set of platform-specific instructions to perform the same function as the original, complex instruction.

As part of the EXE' 203, the writer 242 creates an emitted block information data structure containing the annotations
created by the reader process 210 for each block in the executable. This allows the EXE' 203 to be iterated through the entire process 200 as many times as desired (represented by phantom arrow 260 and described in the related "Translation and Transformation" patent application), while enabling the user to distinguish the original procedures from those added in a previous iteration. In an alternate embodiment, the emitted block information is combined with the PDB file 202 to create a new version of the program database file (PDB') 205 (shown in phantom). The output translation process 240 is described in detail in the related "Cross Module Representation" patent application.

In an alternate exemplary embodiment of the translation and transformation system 200 not illustrated, the IR containing the absolute addresses assigned by the linker 241 is used as input into the IR creation process 212 for further iteration through the system 200. One of skill in the art will immediately appreciate that much of the work performed by the creation process 212 as described above can be skipped when iterating the modified IR through the system 200. This embodiment allows the user to transform a heterogeneous program in stages rather than having to make all the changes in a single pass through the system 200.

The system level overview of the operation of an exemplary embodiment of the invention has been described in this section of the detailed description. A translation and transformation system

information contained within the IR hierarchical elements that represent instructions, blocks, procedures and components (as described in the related "Translation And Transformation" patent application) to determine what processing is required. In the exemplary embodiment, the acts represented by blocks 401-404 are performed by the linker module 241 shown in FIG. 2D, while the acts represented by blocks 405-406 are performed by the writer module 242. The acts represented by block 403 are described in detail with reference to FIGS. 4B and 4C.

The linker/writer method 400 begins by determining if the user has substituted any new code blocks for original code blocks in the IR during the transformation (FIG. 2[illegible]) as illustrated in FIG. 5. In the exemplary embodiment, the transformation process 230 leaves code block A 501 [illegible], and the linker/writer method 400 walks through the IR, finding each entry point reference 502 to block A 501 and replacing it with an entry point reference 504 to block B 503. In an alternate embodiment also shown in FIG. 5 in which block A is supplemented but not wholly replaced in the transformation process, a call to block A 501 is redirected to block B 503. Block B 503 performs its functions, concluding with a jump 505 back to block A 501 so the contents of block A 501 can be executed. In both these embodiments, block A 501 remains in the IR.

, . f . . . . . . i - . In yet another alternate embodiment, the linker/writer 

translates a binary component into an intermediate * , , , , , , . mM ' ' , , , , 

... 1- ■ . _r method replaces code block A 501 with code block B 503, 

representation, provides an application program interface . . * F % A M , \ „ , . . 

*u u u u * 4: »u • * j ■ 4 making block A a "phantom block, 

through which a user can transform the intermediate . 

representation, and translates the intermediate represents 30 Also at block 401, the linker/writer method automaUcally 

tion as transformed by the user into a modified version of the rebuilds the accessary data structures that are specific to the 

binary. While the invention is not limited to any particular bmary output file format. For example, regarding Win32 PE 

arrangement of modules, for sake of clarity exemplary set of (Portable Executable) images, the imports exports, and 

modules has been described. One of skill in the art will ^ local stora g e f ctlons L « reblIllt the 

readily recognize that the functions attributed to the modules 35 information stored in the IR hierarchy data structures, 

described in this section can be assigned to different modules ^ linker/writer method 400 next performs a consistency 

without exceeding the scope of the invention. Furthermore, check on the code blocks in ^ IR t0 preserve the semantics 

although the translation and transformation of only one input of me original control flow among the code blocks (block 

component (EXE 201) has been illustrated and described 403 )- Referring to an example provided by FIG. 6A, blocks 

above, the system can take multiple components, and *o A 601, B 602, C 603, and D 604 were originally arranged as 

accompanying PDB files, as input. shown at 600 with the control flowing from A 601 to B 602 

to C 603 to D 604. In the IR 610, the block order is A 601, 
Methods of Exemplary Embodiments of the D 6 04, B 602 and C 603. Therefore, logical linkages are 
Invention created at block 403 to maintain the original control flow 
In the previous section, a system level overview of the 45 through blocks as shown. Certain optimizations also can be 
operations of exemplary embodiments of the invention was performed at block 403, such as eliminating the jump 
described. In this section, the particular methods performed between blocks B 602 and C 603 in the IR 610. 
by a computer executing such exemplary embodiments are The linker/writer method 400 also determines if two 
described by reference to a series of flowcharts. The meth- contiguous code blocks in the component are to be emitted 
ods to be performed constitute computer programs made up 50 for different platforms and creates the prologue and epilogue 
of computer-executable instructions. Describing the meth- glue code necessary to transition from one instruction set to 
ods by reference to a flowchart enables one skilled in the art another as illustrated in FIG. 6B. In the original component 
to develop such programs including such instructions to 620, blocks A 601, B 602, and C 603 were all written for the 
carry out the methods on a suitable computer (the processor same architecture. However, during the transformation pro- 
of the computer executing the instructions from computer- 55 cess 230, the user marked code block B 602 to be translated 
readable media). FIGS. 4A-C illustrate the acts to be into a different instruction set to produce code block B' 605 
performed by a computer executing an exemplary embodi- in the output binary. The linker/writer method 400 inserts the 
ment of a linker/writer method that performs the output IR instructions for prologue B' 606 and epilogue B' 607 
translator process 240 shown in FIGS. 2A and 2D. A before and after the IR instructions for block B' 605. 
heterogeneous program contains at least one component, 60 Assume, for illustration purposes, that the original compo- 
shown in FIG. 2A as EXE 201, that is translated into an IR nent was written in the Intel x86 instruction set but that the 
hierarchy 220. After any desired transformations 230 are user determined that block B 602 would be more optimally 
performed on the IR hierarchy 220 by a user, the exemplary written as byte codes for a virtual machine (VM). The IR 
embodiment of the linker/writer method 400 takes each block hierarchical element is changed to reflect the new 
component at a time and translates the IR instructions in the 65 architecture. As it walks through the blocks of the 
component into platform-specific instructions as shown in component, the linker/writer method 400 keeps track of 
FIG. 4A. The linkerAvriter method 440 relies on the infor- whether a block is to be emitted for a different platform than 
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that for which it was originally written. The amount of work 
needed for such a transition is directly proportional to how 
different the two architectures are. In the present example, 
IR instructions representing a call to the interpreter for byte 
codes is added as the prologue B' 606, and this becomes a 
new entry point (entry') 609 for the block. The call to the 
interpreter allows the interpreter to get the starting address 
of the byte codes relative to the beginning of the component 
630. Calls from other x86 code blocks to the block B' 605 
can skip the call to the interpreter in the prologue B' 606 and 
jump directly into the byte codes at entry point 608 (native 
entry point). 

The prologue B' 606 and epilogue B' 607 perform the 
necessary register mapping between the two architectures and
adjust the shared stack upon entry and exit from the block B' 
605 when the output binary is executed. 
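
The bracketing of a retargeted block with glue code can be
sketched as follows. This is only an illustrative sketch and not
the implementation of the exemplary embodiment: the Block class,
its field names, the "kind" labels, and the insert_glue function
are hypothetical, and the choice to emit the glue in the block's
original architecture is an assumption drawn from the x86/VM
example above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    name: str
    original_arch: str      # architecture the block was written for
    target_arch: str        # architecture the block is to be emitted for
    kind: str = "code"      # "code", "prologue" or "epilogue" (hypothetical labels)


def insert_glue(blocks: List[Block]) -> List[Block]:
    """Bracket every retargeted block with a prologue and an epilogue block."""
    out: List[Block] = []
    for blk in blocks:
        if blk.target_arch == blk.original_arch:
            out.append(blk)          # no transition needed
            continue
        # Assumption: glue code is emitted in the block's original architecture.
        host = blk.original_arch
        out.append(Block(f"prologue {blk.name}", host, host, "prologue"))
        out.append(blk)
        out.append(Block(f"epilogue {blk.name}", host, host, "epilogue"))
    return out


# Blocks A, B' and C of FIG. 6B, with B' retargeted from x86 to a VM.
component = [
    Block("A", "x86", "x86"),
    Block("B'", "x86", "vm"),
    Block("C", "x86", "x86"),
]
layout = [b.name for b in insert_glue(component)]
kinds = [b.kind for b in insert_glue(component)]
```

In this sketch the register mapping and stack adjustment performed
by the glue are not modeled; only the placement of the prologue
and epilogue around the retargeted block is shown.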

The linker/writer method 400 also preserves the semantics 
of the code flow in architectures in which some instructions, 
such as a loop instruction, have only a short form and thus 
cannot reference addresses past a certain relative distance. If 
in the process of doing a transformation on the IR, additional 
code was inserted in between blocks in the IR so that a loop 
instruction can no longer operate properly, block 402 of the 
linker/writer method 400 inserts indirection code to allow 
the short form to function while maintaining the block
structure that was introduced by the transformation. In one 
embodiment, the indirection code comprises a block within 
the short form addressing range that contains a long form 
jump instruction to the loop instruction's original target. The 
loop instruction addresses the jump instruction, and the 
jump instruction jumps to the target block. 
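
The indirection described above can be sketched as a routing
decision. The sketch is illustrative only: the function names, the
tuple layout of a "hop", and the symmetric reach of 127 bytes are
assumptions made for the example and are not taken from the
exemplary embodiment.

```python
from typing import List, Tuple

SHORT_REACH = 127  # assumed maximum distance a short-form instruction can address

Hop = Tuple[str, int, int]  # (instruction kind, source address, destination address)


def needs_indirection(source: int, target: int, reach: int = SHORT_REACH) -> bool:
    """True when the target lies outside the short-form addressing range."""
    return abs(target - source) > reach


def route_short_branch(source: int, target: int, trampoline: int,
                       reach: int = SHORT_REACH) -> List[Hop]:
    """Return the hops taken by a short-form loop instruction at `source`.

    If the target is out of range, the loop instruction addresses a
    long-form jump placed at `trampoline`, which jumps to the target.
    """
    if not needs_indirection(source, target, reach):
        return [("loop", source, target)]
    if needs_indirection(source, trampoline, reach):
        raise ValueError("trampoline is itself outside the short-form range")
    return [("loop", source, trampoline), ("long_jump", trampoline, target)]


direct = route_short_branch(1000, 1050, trampoline=1100)
indirect = route_short_branch(1000, 400, trampoline=950)
```

Here `direct` is a single hop because the target is within reach,
while `indirect` passes through the long-form jump at address 950.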

Now that all the blocks have been properly linked and any 
necessary glue code added, the linker/writer method 400 
assigns absolute addresses in an address space for the 
modified executable EXE' 203 (FIG. 2A) to the blocks in the 
IR and resolves the logical connections to the appropriate 
absolute addresses (block 403). In the exemplary 
embodiment, the processing at block 403 begins by initially 
assigning an optimal fixed size to each code block whose 
size could fluctuate (because certain instructions have both long
and short forms). The size for a data block is fixed and will never
change. The optimal fixed size for a code block is computed 
based on the average number of instructions per block 
multiplied by the average instruction size. In one 
embodiment, the average number of instructions per block is 
3.2 and the average instruction size is 2.5 bytes, giving an 
optimal fixed size of 8 bytes. This initial size assignment 
allows the linker/writer method 400 to perform forward 
referencing on blocks that have not yet been assigned 
working or absolute addresses as described next. 

The absolute address assignment is performed through 
two major processing loops. The first or "priming" loop is 
illustrated in FIG. 4B. The priming loop assigns working 
addresses to the blocks that represent a displacement relative 
to the start of the component to initially approximate the 
block address. For those instructions that have both a long 
form and a short form ("relative instructions"), the priming 
loop assigns the short size. After the priming loop has 
assigned working addresses to each of the blocks, the second
or "verification" loop, illustrated in FIG. 4C, reevaluates the
size of the relative instructions and creates the absolute
addresses. Some relative instructions that were originally
assumed to be short form may need to change to long form
because their target reference is now further away than
originally calculated using the initial average sizes for the
blocks. Once the priming loop has assigned fairly accurate
sizes to the blocks, those relative instructions that are likely
to require the long form can be determined by the verification
loop with little or no error.

Turning now to FIG. 4B, the priming loop begins by
setting a "currentaddress" variable to the starting address of
the component (block 410). Each block in the component is
examined (block 411) and assigned a starting address equal
to the current value in the currentaddress variable (block
412). If the block is a data block (block 413), the
currentaddress variable is recalculated to be at the end of the
data block using the pre-determined size for a data block (block
414).

If the block is a code block (block 413), each instruction
in the code block is examined (block 416) to determine if it
is a relative or regular instruction (block 417). A relative
instruction is assigned the short form size (block 418). The
size of a regular instruction is calculated (block 419). The
size of each instruction is stored in the IR element that
represents the instruction. In the exemplary embodiment, the
short form size is two bytes and the size of a regular
instruction is calculated by calling the appropriate assembler
to emit the instruction so the actual size of the instruction
can be determined. The appropriate assembler is specified in
the IR block element; the calling of the assembler is
described further below. The currentaddress variable is
recalculated to take into account the instruction's size (block
420).

When all instructions (block 421) in all blocks (block
415) in the component have been examined, every instruction
and block has been assigned a working size, and every
block has been assigned a relative working starting address.
In an alternate embodiment in which the binary is divided
into sections and each section must be assigned on a
particular byte boundary, the priming loop is performed on
a section by section basis throughout the entire IR,
readjusting the starting address of the first block in each
section to lie on the correct boundary.
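
The priming loop of FIG. 4B can be sketched as follows. The block
numbers in the comments refer to the flowchart acts described
above. The Instr and Block classes, the prime function, and the
assemble_size callback (standing in for the call to the
appropriate assembler) are hypothetical names introduced for
illustration; only the two-byte short form size comes from the
exemplary embodiment.

```python
from dataclasses import dataclass, field
from typing import Callable, List

SHORT_FORM_SIZE = 2  # bytes, as in the exemplary embodiment


@dataclass
class Instr:
    opcode: str
    is_relative: bool = False
    size: int = 0                # working size, stored on the IR element


@dataclass
class Block:
    is_data: bool = False
    data_size: int = 0           # pre-determined size for a data block
    instrs: List[Instr] = field(default_factory=list)
    address: int = 0             # working starting address


def prime(blocks: List[Block], start: int,
          assemble_size: Callable[[Instr], int]) -> int:
    """Assign working addresses and sizes; return the address past the last block."""
    current = start                              # block 410
    for blk in blocks:                           # block 411
        blk.address = current                    # block 412
        if blk.is_data:                          # block 413
            current += blk.data_size             # block 414
            continue
        for ins in blk.instrs:                   # block 416
            if ins.is_relative:                  # block 417
                ins.size = SHORT_FORM_SIZE       # block 418
            else:
                ins.size = assemble_size(ins)    # block 419
            current += ins.size                  # block 420
    return current


# Hypothetical instruction sizes standing in for a real assembler.
sizes = {"mov": 3, "add": 2}
blocks = [
    Block(instrs=[Instr("mov"), Instr("jmp", is_relative=True)]),
    Block(is_data=True, data_size=8),
    Block(instrs=[Instr("add")]),
]
end = prime(blocks, 0x1000, lambda ins: sizes[ins.opcode])
addresses = [b.address for b in blocks]
```

The sketch omits the section-by-section alignment of the alternate
embodiment.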

The verification loop illustrated in FIG. 4C checks for the
correctness of the size of all relative instructions within the
component. If no size adjustments are required, the working
addresses are the correct addresses and no address
recalculation needs to be done. Therefore, the verification loop
uses a "stop" flag to determine that no change is required. If the
value of the stop flag is true when the verification loop has
reexamined each instruction in the component, no
recalculation is necessary.

The verification loop examines each block in the
component (block 433). If it is a code block (block 434), each
instruction in the code block is examined to determine if it
is a relative instruction (block 436). A new size for each
relative instruction is calculated using the working block
sizes assigned by the priming loop (block 437). If the new
size is different than that stored in the corresponding IR
instruction element, i.e., long rather than short, (block 438),
the new size is stored (block 439) and the stop flag is set to
false to indicate that the block sizes and addresses must be
recomputed.

Once all instructions (block 441) in all code blocks (block
442) have been reexamined, the currentaddress variable is
set to the starting address of the component (block 443) and
the stop flag is tested (block 444). If the stop flag is true
(block 444 and 431), the working addresses assigned by the
priming loop are correct and become the absolute addresses
for the component.

If the stop flag is false (block 444), at least one relative
instruction has changed in size from short to long, so its
block has changed size and the addresses of all blocks within
the component must be recalculated. Each block in the
component is once again examined (block 445) and its
address recomputed based on the new sizes for the
instructions (block 446). The stop flag is set to true (block 448)
when addresses have been recomputed for all the blocks (block
447) to end the verification loop. At this point the IR is
complete with absolute addresses. If a relative instruction
that has been changed to the long form could later fit the
short form again, the linker/writer method 400 does not revert
it to the short form because this would introduce an infinite
loop in the process. By prohibiting an instruction from
shrinking back, the assign addresses loop illustrated in FIG. 4B
is guaranteed to converge and terminate within a "reasonable"
time period.
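
The grow-only behavior that guarantees termination can be
sketched as follows. This is a simplified illustration under
stated assumptions, not the flowchart of FIG. 4C itself: the
sketch repeats the check until a pass makes no change, the long
form size of 5 bytes and the reach of 127 bytes are assumed
values, and the class and function names are hypothetical. Only
the two-byte short form and the rule that an instruction never
shrinks back come from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

SHORT_SIZE = 2     # bytes, as in the exemplary embodiment
LONG_SIZE = 5      # assumed long form size
SHORT_REACH = 127  # assumed maximum short-form displacement


@dataclass
class Instr:
    size: int
    target: Optional[int] = None   # index of the target block, if relative


@dataclass
class Block:
    instrs: List[Instr] = field(default_factory=list)
    address: int = 0


def assign_addresses(blocks: List[Block], start: int) -> None:
    """Recompute every block address from the current instruction sizes."""
    current = start
    for blk in blocks:
        blk.address = current
        current += sum(ins.size for ins in blk.instrs)


def verify(blocks: List[Block], start: int) -> int:
    """Grow relative instructions until all sizes are valid; return pass count."""
    passes = 0
    while True:
        passes += 1
        stop = True
        for blk in blocks:
            offset = blk.address
            for ins in blk.instrs:
                offset += ins.size             # address just past this instruction
                if ins.target is None:
                    continue                   # regular instruction
                distance = blocks[ins.target].address - offset
                needed = SHORT_SIZE if abs(distance) <= SHORT_REACH else LONG_SIZE
                if needed > ins.size:          # grow only, never shrink back
                    ins.size = needed
                    stop = False
        if stop:
            return passes
        assign_addresses(blocks, start)


# A short jump over 160 bytes of filler must become long form.
blocks = [
    Block([Instr(SHORT_SIZE, target=2)]),
    Block([Instr(4) for _ in range(40)]),
    Block([Instr(1)]),
]
assign_addresses(blocks, 0)        # stands in for the priming loop
passes = verify(blocks, 0)
```

Because each relative instruction can grow at most once, the
number of passes in this sketch is bounded by the number of
relative instructions plus one.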

Returning now to FIG. 4A, the linker/writer method 400
creates, or updates, the emitted block information data
structure (204 or 205 in FIG. 2A) for each block (block 404).
One exemplary embodiment of the emitted block information
is shown in FIG. 7 as having a block header 715 and a
set of symbol table entries 720. The block header 715
contains a block address field 701, a block size field 702, an
alignment field 703, and a set of flags 704-710. The block
address field 701 is the absolute address for the block within
the component assigned by the verification loop in FIG. 4C.
The block size field 702 contains the size of the block
computed by the verification loop. Table 1 defines the flags
703-709 for the present exemplary embodiment of the
header.



TABLE 1

Flag              Block Type    Description
IsData            Code or Data  defines block as code or data
IsCallTarget      Code          whether block contains an entry point for a procedure
IsInstrumentable  Code          whether block can have instrumentation added by user
IsUnreachable     Code or Data  whether block can be reached from another block in the binary
IsNoReturn        Code          whether block transfers control and does not return to calling block
Alignment         Code or Data  boundary on which to align the block, if any
Assembler         Code          code architecture
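
One possible in-memory shape for the emitted block information is
sketched below. The field grouping follows the block header and
symbol table entries described above and the entries of Table 1,
but the class names, Python types, default values, and the choice
to hold Alignment and Assembler as plain fields rather than flag
bits are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Flag, auto
from typing import List, Tuple


class BlockFlags(Flag):
    """Boolean entries of Table 1 (names adapted to Python style)."""
    NONE = 0
    IS_DATA = auto()
    IS_CALL_TARGET = auto()
    IS_INSTRUMENTABLE = auto()
    IS_UNREACHABLE = auto()
    IS_NO_RETURN = auto()


@dataclass
class BlockHeader:
    address: int                         # absolute address of the block
    size: int                            # size computed by the verification loop
    alignment: int = 1                   # boundary on which to align the block
    flags: BlockFlags = BlockFlags.NONE
    assembler: str = ""                  # code architecture, e.g. "x86"


@dataclass
class EmittedBlockInfo:
    header: BlockHeader
    # Each symbol entry pairs a symbol name with a symbol address.
    symbols: List[Tuple[str, int]] = field(default_factory=list)


info = EmittedBlockInfo(
    header=BlockHeader(
        address=0x1000,
        size=24,
        alignment=16,
        flags=BlockFlags.IS_CALL_TARGET | BlockFlags.IS_INSTRUMENTABLE,
        assembler="x86",
    ),
    symbols=[("main", 0x1000)],
)
is_entry = BlockFlags.IS_CALL_TARGET in info.header.flags
is_data = BlockFlags.IS_DATA in info.header.flags
```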



The linker/writer method 400 creates a large buffer into
which the platform-specific code will be emitted (block
405). The method walks through the IR block by block,
calling the appropriate assembler to emit the code as defined
within the IR block element. For IR instructions that fit a
basic format, such as "load," "store," or "add," the
linker/writer method 400 submits the appropriate parameters to the
assembler and the assembler passes back the corresponding
platform-specific instruction, which is placed within the
buffer at the address for the instruction. For IR instructions
that do not fit the basic format, referred to as extended
version or mode IR instructions, the IR instruction element
is marked with the signature flag of the original architecture. The
linker/writer method 400 first determines if the instruction is
to be emitted for its original platform. If so, the appropriate
parameters from the IR instruction element are passed to the
assembler just as with the basic format IR instruction. If the
architecture is changing, the linker/writer method 400 uses
a translation data structure, such as a hash table, to
determine a set of basic format IR instructions that perform the
function of the extended mode IR instruction. Each of those
IR instructions is passed into the assembler to be emitted as
the corresponding platform-specific instruction. There is a
translation data structure for each architecture supported by
the system and each translation data structure contains
entries for only those platform-specific instructions that
cannot be represented by the basic IR instruction form. The
translation data structures are created outside the system and
the information contained therein will be readily apparent to
one skilled in the art. An exemplary embodiment of the IR
instruction elements in both the basic form and extended
version is described in detail in the related "Translation and
Transformation" patent application.
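
The emission decision for basic and extended mode IR instructions
can be sketched with a hash table per source architecture. The
sketch is hypothetical throughout: the IRInstr class, the
"inc_mem" extended instruction and its expansion, the TRANSLATION
table, and the fake_assemble stand-in are invented for
illustration and do not reproduce the IR format of the related
application or any real assembler interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class IRInstr:
    opcode: str
    operands: Tuple[str, ...]
    origin_arch: str = ""        # signature flag; empty for basic format


BASIC_OPCODES = {"load", "store", "add"}


def expand_inc_mem(operands: Tuple[str, ...]) -> List[IRInstr]:
    """Hypothetical expansion of 'increment memory' into basic instructions."""
    (addr,) = operands
    return [
        IRInstr("load", ("tmp", addr)),
        IRInstr("add", ("tmp", "tmp", "1")),
        IRInstr("store", (addr, "tmp")),
    ]


# One translation table per source architecture, holding only the
# instructions that the basic format cannot represent.
TRANSLATION: Dict[str, Dict[str, Callable[[Tuple[str, ...]], List[IRInstr]]]] = {
    "x86": {"inc_mem": expand_inc_mem},
}

Assembler = Callable[[str, IRInstr], bytes]


def emit(instr: IRInstr, target_arch: str, assemble: Assembler) -> List[bytes]:
    """Emit one IR instruction as one or more platform-specific instructions."""
    if instr.opcode in BASIC_OPCODES or instr.origin_arch == target_arch:
        return [assemble(target_arch, instr)]
    expand = TRANSLATION[instr.origin_arch][instr.opcode]
    return [assemble(target_arch, basic) for basic in expand(instr.operands)]


def fake_assemble(arch: str, instr: IRInstr) -> bytes:
    """Stand-in for a real assembler: returns a readable tag, not machine code."""
    return f"{arch}:{instr.opcode} {','.join(instr.operands)}".encode()


inc = IRInstr("inc_mem", ("[x]",), origin_arch="x86")
same_platform = emit(inc, "x86", fake_assemble)
other_platform = emit(inc, "vm", fake_assemble)
```

When the target matches the original architecture the extended
instruction is emitted as is; otherwise it is replaced by the
basic format instructions found in the table.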

After the buffer has been fully populated with instructions 
for the binary, at block 405 the linker/writer method 400 also 
updates the source and target addresses that are stored within 
a symbol table or a program database (PDB file 205 in FIG. 
2A) that contain entry points, export entry tables, jump 
tables, and symbol tables. In one embodiment the informa- 
tion for a source or target address is a displacement into a 
block instead of being the first address within the block. 

The linker/writer method 400 concludes by writing out 
the buffer to an output file specified by the user to serve as 
the binary for the component (block 406). The emitted block 
data structures 700 are written to the output file for the 
binary (as illustrated by emitted data block information 204 
in FIG. 2A) or to the PDB file 205. Additional internal use 
information is also written to the output file at this time. In 
one embodiment in which the emitted block information is 
stored in the output file, the internal use information speci- 
fies the offset within the output file at which the emitted 
block information begins. The internal use information can 
also contain data such as extended debugging information.
In another embodiment in which the emitted block infor- 
mation is written to a separate file, such as the PDB file 205, 
the internal use information is written to the same file. In yet 
another alternate embodiment, the internal use information 
is written to a completely separate file that is associated with 
the emitted block information file. 
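
For the embodiment in which the emitted block information is
stored in the output file, the role of the offset can be sketched
as follows. The layout used here, a four-byte little-endian
trailer at the end of the file holding the offset, is purely an
assumption for illustration; the description above states only
that the internal use information specifies the offset at which
the emitted block information begins.

```python
import io
import struct

TRAILER = struct.Struct("<I")  # assumed: 4-byte little-endian offset at end of file


def write_output(code: bytes, block_info: bytes) -> bytes:
    """Write the code buffer, then the block information, then the offset trailer."""
    out = io.BytesIO()
    out.write(code)
    info_offset = out.tell()             # where the block information begins
    out.write(block_info)
    out.write(TRAILER.pack(info_offset))
    return out.getvalue()


def read_block_info(image: bytes) -> bytes:
    """Locate the block information again using the offset in the trailer."""
    (info_offset,) = TRAILER.unpack(image[-TRAILER.size:])
    return image[info_offset:-TRAILER.size]


image = write_output(b"\x90\x90\xc3", b"BLOCKINFO")
recovered = read_block_info(image)
```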

The particular methods performed by a computer in executing
an exemplary embodiment of the output translator 240
have been described with reference to flowcharts including 
all the acts from 401 until 406, 410 until 421, and 430 until 
448. In addition, an exemplary embodiment of an emitted 
block information data structure created by the output trans- 
lator 240 has been illustrated. 

Conclusion 

An output translator has been described that provides for 
cross module representations of components within a het- 
erogeneous program. The heterogeneous program is trans- 
lated into an intermediate representation that is a hierarchy 
of platform-neutral elements that correspond to instructions, 
code blocks, procedures and components within the pro- 
gram. Blocks of instructions that were originally written for 
one architecture can be translated from the intermediate 
representation into platform-specific instructions for a dif- 
ferent architecture. The output translator provides any nec- 
essary prologue and/or epilogue code to interface contiguous 
code blocks that are emitted in different instruction sets. 
Furthermore, for an architecture that has both short and long
forms for instructions, the output translator initially assumes
the emitted instruction will be in its short form and only
substitutes the long form when required, so the output code can
be more compact than when using the long form as the default
output format. The choice of the short form for the default
also introduces efficiencies into the output translator process
because working relative addresses assigned by the priming
loop are highly likely to be the absolute addresses within the
output binary, reducing the number of iterations that must be
performed to emit the code. Finally, the ability to translate the
code blocks within the components into architectures different
from those for which they were originally written enables faster
execution of the program.

Although specific embodiments have been illustrated and
described herein, it will be appreciated by those of ordinary
skill in the art that any arrangement which is calculated to
achieve the same purpose may be substituted for the specific
embodiments shown. This application is intended to cover
any adaptations or variations of the present invention.

For example, those of ordinary skill within the art will
appreciate that the division of the output translator into a
linker and a writer module, and the methods that each
perform, can be allocated differently without changing the
functions performed by the output translator. Furthermore,
those of ordinary skill within the art will appreciate that the
translation from the IR instructions into the platform-specific
instructions can be accomplished through the use of
look-up tables, hashing functions, or database records.

The terminology used in this application with respect to is
meant to include all of these architectural environments. 
Therefore, it is manifestly intended that this invention be 
limited only by the following claims and equivalents 
thereof. 

We claim:

1. A computerized method for translating a heterogeneous
program into different architectures comprising:

reading a heterogeneous program having a plurality of
executable components in different forms;

obtaining a platform-neutral intermediate representation
of a component in the heterogeneous program;

creating logical linkages among a plurality of code blocks
in the intermediate representation of the component to
establish a control flow through the component;

assigning an absolute address within an address space for
the component to each of the plurality of code blocks;

resolving the logical linkages to the absolute addresses for
the corresponding code blocks;

emitting a platform-specific executable instruction for
each instruction represented in the intermediate
representation of the component into a buffer; and

writing the buffer to an output file to create a new version
of the component.

2. The computerized method of claim 1, further comprising:

inserting interface code between contiguous code blocks
having instructions emitted for different architectures.

3. The computerized method of claim 1, further comprising:

replacing an entry point to a first code block with an entry
point to a second code block introduced by a user to
substitute for the first code block.

4. The computerized method of claim 1, further comprising:

generating information defining an emitted block; and

associating the emitted block information with the new
version of the component.

5. The computerized method of claim 1, wherein assigning
an absolute address comprises:

assigning a working address and working size to each
block in the intermediate representation of the
component;

assigning the working address as the absolute address if
the working size of each code block is accurate for a
corresponding emitted code block; and

calculating the absolute address if the working size of
each code block is not accurate for the corresponding
emitted code block.

6. The computerized method of claim 5, wherein assign- 
ing the working address and working size comprises: 

determining an emitted size for each instruction in each 
code block that is a regular instruction; and 

assigning a fixed size for a short form of each instruction 
in each code block that is a relative instruction. 

7. The computerized method of claim 6, wherein the 
working size of each code block is accurate for the corre- 
sponding emitted code block if the short form is valid for 
each relative instruction within the code block. 

8. The computerized method of claim 1, wherein emitting 
a platform-specific instruction comprises: 

inputting parameters defining the intermediate represen- 
tation of the instruction to an assembler; and 

storing the corresponding platform-specific instruction 
generated by the assembler into the buffer. 

9. The computerized method of claim 8, further compris- 
ing: 

obtaining the parameters from a translation data structure. 

10. The computerized method of claim 1 wherein obtaining
comprises:

parsing an executable component in the heterogeneous
program into basic code blocks; and

creating an intermediate representation of the basic code
blocks, the intermediate representation comprising a
hierarchy of instructions.

11. The computerized method of claim 10 wherein the 
step of creating an intermediate representation comprises: 

annotating the intermediate representation with user 
names. 

12. A computer-readable medium having computer- 
executable instructions to cause a computer to perform an 
output translation method on an intermediate representation 
for a component comprising: 

resolving logical references within the intermediate rep- 
resentation to absolute addresses in an address space 
for the component; 

emitting instructions for the component in a plurality of 
platform-specific instruction sets defined by the inter- 
mediate representation; and 

inserting interface instructions in the intermediate repre- 
sentation between code blocks marked to be emitted in 
different platform-specific instruction sets, the interface 
instructions enabling a transition from one platform- 
specific instruction set to a different platform-specific 
instruction set. 

13. The computer-readable medium of claim 12, wherein
resolving logical references comprises:

creating working addresses for a plurality of code blocks
in the intermediate representation;

assigning absolute addresses for the plurality of code
blocks based on the working addresses; and

resolving each logical reference to one of the plurality of
code blocks to the absolute address for the one of the
plurality of code blocks.

14. A computer-readable medium having stored thereon 
an emitted block information data structure for describing a 
code block emitted by a cross-module representation 
system, the data structure comprising: 

an address field containing data representing a starting
address for the emitted block;

a size field containing data representing a size of the block
starting at the address in the address field;

a flag field containing data representing a set of
information flags for the block starting at the address in the
address field; and

a symbol entry containing data representing a symbol
name and a symbol address for a symbol appearing in
the block starting at the address in the address field,
wherein the emitted block and the data structure are
accessible by the cross-module representation system
to facilitate optimization of the emitted block.

15. The computer-readable medium of claim 14, further
comprising:

an alignment field containing data representing an
alignment boundary for the address in the address field.

16. The computer-readable medium of claim 14, wherein
the information flags are selected from the group consisting
of:

a call target flag;

a begin procedure flag;

a no split flag;

an instrumentable flag;

an unreachable flag;

a data flag; and

a no return flag.

17. A computerized system comprising:

a processing unit;

a system memory coupled to the processing unit through
a system bus;

a computer-readable medium coupled to the processing
unit through a system bus;

a reader module reading a heterogeneous program
comprising a plurality of executable components written for
two or more computer architectures;

a platform-neutral intermediate representation of the
heterogeneous program in the system memory; and

an output translator module executed from the
computer-readable medium by the processing unit, wherein the
output translator module causes the processing unit to
translate the intermediate representation into a set of
platform-specific instructions that accomplish the
function of the heterogeneous program.

18. The computerized system of claim 17, wherein the set
of platform-specific instructions includes subsets of
platform-specific instructions with the subsets being for
different platforms.

19. The computerized system of claim 18, wherein the 
output translator module further causes the processing unit 
to insert at least one platform-neutral instruction in the 
intermediate representation to provide an interface between 
the subsets of platform-specific instructions. 

20. The computerized system of claim 17, wherein the 
output translator module further causes the processing unit 
to translate at least one instruction in the intermediate 
representation into a platform-specific instruction different 
than a corresponding platform-specific instruction in the 
heterogeneous program. 

21. The computerized system of claim 17, wherein the 
output translator module further causes the processing unit 
to activate a platform-specific assembler to generate the set 
of platform-specific instructions. 

22. A computer-readable medium having computer- 
executable instructions stored thereon for performing a 
method comprising: 

reading a heterogeneous program comprising a plurality 
of executable components written in two or more 
instruction sets; 

translating an intermediate representation of the
heterogeneous program into a set of platform-specific
instructions; and

writing the platform-specific instructions onto a 
computer-readable medium to create a new version of 
the heterogeneous program. 

23. The computer-readable medium of claim 22, wherein 
translating the intermediate representation comprises: 

emitting an instruction in the intermediate representation 
into a platform-specific instruction different than a 
corresponding platform-specific instruction in the het- 
erogeneous program. 

24. The computer-readable medium of claim 23, further 
comprising: 

inserting interface instructions in the intermediate repre- 
sentation between contiguous instructions emitted for 
different platforms. 

***** 






